Constraint-Based Cluster Ensemble to Detect Outliers in Medical Datasets
Abstract
Outlier analysis in medical datasets can reveal significant traits in the behavioral patterns of genes. The presence of outliers may indicate symptoms of genetic disorders or mutant tumors. For genetic disorders, curative medicines can be designed only after the gene-gene and gene-tumor relationships have been studied. Identifying outlier observations alone is therefore insufficient to clarify the source of the outliers, i.e. the tumors to which they are related. Most existing work uses a single clustering algorithm to detect outlier patterns in bio-molecular data, but single clustering algorithms lack robustness, stability and accuracy. This work uses a form of semi-supervised cluster ensemble to analyze outlier patterns based on their relations to clusters. Specifically, prior knowledge about a dataset is fed to the cluster ensemble in the form of constraints. The resulting clusters are analyzed to detect outliers by filtering out insignificant clusters. The outlier-cluster association is then calculated using a fuzzy approach. The combined fuzzy, constraint-based cluster ensemble approach can be used to effectively analyze outliers in medical datasets.
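The abstract does not give the formulas behind the filtering and fuzzy association steps, so the following is only a minimal sketch under stated assumptions. It assumes a consensus partition (`labels`) is already available from the ensemble, treats clusters smaller than a hypothetical `min_size` threshold as insignificant, takes their members as outlier candidates, and scores each candidate against the remaining clusters with the standard fuzzy c-means membership formula. The function names, the size-based significance rule and the choice of centroid distance are illustrative, not the authors' method.

```python
import math
from collections import defaultdict


def split_clusters(labels, min_size):
    """Group point indices by label; clusters below min_size are
    treated as insignificant (an assumed rule, not from the paper)."""
    groups = defaultdict(list)
    for idx, lab in enumerate(labels):
        groups[lab].append(idx)
    significant = {lab: idxs for lab, idxs in groups.items()
                   if len(idxs) >= min_size}
    outliers = sorted(i for idxs in groups.values()
                      if len(idxs) < min_size for i in idxs)
    return significant, outliers


def centroid(points, idxs):
    """Arithmetic mean of the selected points, one value per dimension."""
    dim = len(points[0])
    return [sum(points[i][d] for i in idxs) / len(idxs) for d in range(dim)]


def fuzzy_association(point, centroids, m=2.0):
    """Fuzzy c-means style membership of one point to each centroid.
    Memberships sum to 1; m > 1 is the fuzzifier."""
    dists = {lab: math.dist(point, c) for lab, c in centroids.items()}
    # A point lying exactly on a centroid gets full membership there.
    zero = [lab for lab, d in dists.items() if d == 0.0]
    if zero:
        return {lab: (1.0 / len(zero) if lab in zero else 0.0)
                for lab in dists}
    exp = 2.0 / (m - 1.0)
    return {lab: 1.0 / sum((d / other) ** exp for other in dists.values())
            for lab, d in dists.items()}


def outlier_cluster_association(points, labels, min_size=3, m=2.0):
    """Map each outlier index to its fuzzy membership in every
    significant cluster."""
    significant, outliers = split_clusters(labels, min_size)
    centroids = {lab: centroid(points, idxs)
                 for lab, idxs in significant.items()}
    return {i: fuzzy_association(points[i], centroids, m) for i in outliers}


# Toy data: two dense groups and one isolated point (index 6).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (3, 3)]
labels = [0, 0, 0, 1, 1, 1, 2]
assoc = outlier_cluster_association(points, labels)
for idx, memberships in assoc.items():
    print(idx, {lab: round(u, 3) for lab, u in memberships.items()})
```

In this toy run the isolated point is associated more strongly with the nearer cluster, which is the kind of outlier-to-cluster relation the abstract describes. A real pipeline would replace `labels` with the output of the constraint-based ensemble.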
Similar resources
Wised Semi-Supervised Cluster Ensemble Selection: A New Framework for Selecting and Combing Multiple Partitions Based on Prior knowledge
The Wisdom of Crowds, an innovative theory described in social science, claims that the aggregate decisions made by a group will often be better than those of its individual members if the four fundamental criteria of this theory are satisfied. This theory has been applied to clustering problems. Previous research showed that this theory can significantly increase the stability and performance of...
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
A Pre-Trained Ensemble Model for Breast Cancer Grade Detection Based on Small Datasets
Background and Purpose: Breast cancer is reported as one of the most common cancers among women. Early detection of the cancer type is essential to inform subsequent treatments. The newest proposed breast cancer detectors are based on deep learning. Most of these works focus on large datasets and are not developed for small datasets. Although the large datasets might lead ...
A new ensemble clustering method based on fuzzy cmeans clustering while maintaining diversity in ensemble
Ensemble clustering has been considered one of the research approaches in data mining, pattern recognition, machine learning and artificial intelligence over the last decade. In clustering, the combination first produces several base clusterings, and then, for their aggregation, a function is used to create a final cluster that is as similar as possible to all the cluster bundles. The inp...